Enhancing Multi-agent Bargaining with the TD-based Reinforcement Learning Approach

Authors

  • Shiu-li Huang
  • Fu-ren Lin
Abstract

This study proposes a negotiation mechanism that applies TD-based reinforcement learning to on-line bargaining between two parties, both of which have incomplete information. An agent equipped with the TD-based reinforcement learning capability can learn a dynamic strategy incrementally, on its own, from past bargaining experience. The study investigates two scenarios: a TD-based seller agent bargaining with buyers who have different risk attitudes, and seller and buyer agents that both have the TD-based reinforcement learning capability negotiating with each other. Bargaining experiments are conducted on JADE, a software agent framework based on the FIPA specifications, to evaluate bargaining performance in terms of average payoff and settlement rate. Results show that the negotiation mechanism handles the multi-agent bargaining process effectively. The mechanism can be further applied to electronic commerce environments for on-line automated bargaining.
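
To make the idea of a seller that learns from past bargaining rounds concrete, here is a minimal sketch in Python. It is not the authors' mechanism: the abstract does not specify the state representation, reward, or TD variant, so this sketch assumes tabular Q-learning (one common TD method), a state equal to the round number, a fixed set of ask prices, and a non-learning buyer who accepts any ask at or below a hidden limit. All names and numbers (`ASKS`, `COST`, `MAX_ROUNDS`, `td_update`, `run_episode`, `train`) are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical toy setup, not taken from the paper.
ASKS = [60, 70, 80, 90]   # prices the seller may ask
COST = 50                 # seller's cost; payoff is ask - COST on settlement
MAX_ROUNDS = 3            # bargaining deadline


def td_update(q, state, action, reward, next_state, terminal,
              alpha=0.1, gamma=0.9):
    """Apply one Q-learning (off-policy TD) update and return the TD error."""
    best_next = 0.0 if terminal else max(q[(next_state, a)] for a in ASKS)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error


def run_episode(q, buyer_limit, epsilon, rng):
    """Play one bargaining session; return the seller's payoff (0 if no deal)."""
    for rnd in range(MAX_ROUNDS):
        # Epsilon-greedy choice of the ask price for this round.
        if rng.random() < epsilon:
            ask = rng.choice(ASKS)
        else:
            ask = max(ASKS, key=lambda a: q[(rnd, a)])
        # Assumed buyer model: accept any ask at or below a hidden limit.
        accepted = ask <= buyer_limit
        reward = float(ask - COST) if accepted else 0.0
        terminal = accepted or rnd == MAX_ROUNDS - 1
        td_update(q, rnd, ask, reward, rnd + 1, terminal)
        if accepted:
            return reward
    return 0.0


def train(episodes=2000, buyer_limit=80, epsilon=0.2, seed=0):
    """Train the seller against a fixed buyer and return the learned Q-table."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        run_episode(q, buyer_limit, epsilon, rng)
    return q
```

Calling `train()` and then taking `max(ASKS, key=lambda a: q[(0, a)])` gives the seller's greedy opening ask. In this toy setup the discount factor makes delay costly, so the learned opening ask should be the highest price the fixed buyer accepts rather than a higher ask that only prolongs the session. Average payoff and settlement rate, the two measures named in the abstract, could be estimated by averaging the return value of `run_episode` and counting non-zero payoffs.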

Related resources

Multiagent Learning with Bargaining - A Game Theoretic Approach

Learning in the real world occurs when an agent, which perceives its current state and takes actions, interacts with the environment, which in return provides positive or negative feedback. The field of reinforcement learning studies such processes and attempts to find policies that map states of the world to the actions of agents in order to maximize cumulative reward over the long run. In m...

Cooperative Cognitive Agents and Reinforcement Learning in Pursuit Game

This paper illustrates how a self-organizing cognitive architecture, known as TD-FALCON, can learn to function and cooperate in a dynamic environment. TD-FALCON learns the value functions of the state-action space estimated through a temporal difference (TD) method. The learned value functions are then used to determine the optimal actions based on an action selection policy. To tackle a multi-a...

TD Models: Modeling the World at a Mixture of Time Scales

Temporal-difference (TD) learning can be used not just to predict rewards, as is commonly done in reinforcement learning, but also to predict states, i.e., to learn a model of the world's dynamics. We present theory and algorithms for intermixing TD models of the world at different levels of temporal abstraction within a single structure. Such multi-scale TD models can be used in model-based rein...

Low-Area/Low-Power CMOS Op-Amps Design Based on Total Optimality Index Using Reinforcement Learning Approach

This paper presents the application of reinforcement learning in automatic analog IC design. In this work, the Multi-Objective approach by Learning Automata is evaluated for meeting the required functionality and performance specifications while minimizing MOSFET area and power consumption for two well-known CMOS op-amps. The results show the ability of the proposed method to ...

An Unsupervised Learning Method for an Attacker Agent in Robot Soccer Competitions Based on the Kohonen Neural Network

The RoboCup competition, as a great test-bed, has become a popular domain worldwide in recent years. The main objective of such competitions is to deal with the complex behavior of systems which consist of multiple autonomous agents. The rich experience of human soccer players can be used as a valuable reference for a robot soccer player. However, because of the differences between real and simulated soc...

Publication date: 2003